auxiliary function
- Information Technology > Artificial Intelligence > Representation & Reasoning > Optimization (0.68)
- Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Clustering (0.46)
- Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.46)
3b91129cf07287aac3de7b8adba2196f-Supplemental-Conference.pdf
As we choosec2 = 2 and c3 = 0, all the conditions in Theorem 2.2 are satisfied. Therefore, the exponential stability of the zero solution is assured. ApplyingItˆo'sformulato log x yields: log x log x(0) = Convergence Time for ESFrom the conditions in Theorem 2.2 and the equivalent condition (10),(11), the hyperparameterb for ES can be substantially regarded as A.3.5 MoreDiscussionofAS/ES Understanding ASLoss We utilize the formula x 2(2 x,F(x) + G(x) 2F) (2 α) x G(x) 2 q(x) in (3) to construct the AS loss. Here we explain this term in more detail. Particularly, we sample5,000pointsfrom ( ρ1,, ρ20) ( θ1,, θ19) U([0,5]20) U([ 5,5]19), and get the information of the dynamics based on system(18) with (19) and (20).
- North America > United States > New York > New York County > New York City (0.04)
- Asia > Middle East > Jordan (0.04)
Reverse engineering recurrent neural networks with Jacobian switching linear dynamical systems
Recurrent neural networks (RNNs) are powerful models for processing time-series data, but it remains challenging to understand how they function. Improving this understanding is of substantial interest to both the machine learning and neuroscience communities. The framework of reverse engineering a trained RNN by linearizing around its fixed points has provided insight, but the approach has significant challenges. These include difficulty choosing which fixed point to expand around when studying RNN dynamics and error accumulation when reconstructing the nonlinear dynamics with the linearized dynamics. We present a new model that overcomes these limitations by co-training an RNN with a novel switching linear dynamical system (SLDS) formulation. A first-order Taylor series expansion of the co-trained RNN and an auxiliary function trained to pick out the RNN's fixed points govern the SLDS dynamics. The results are a trained SLDS variant that closely approximates the RNN, an auxiliary function that can produce a fixed point for each point in state-space, and a trained nonlinear RNN whose dynamics have been regularized such that its first-order terms perform the computation, if possible. This model removes the post-training fixed point optimization and allows us to unambiguously study the learned dynamics of the SLDS at any point in state-space.
LLM Collaboration With Multi-Agent Reinforcement Learning
Liu, Shuo, Chen, Tianle, Liang, Zeyu, Lyu, Xueguang, Amato, Christopher
A large amount of work has been done in Multi-Agent Systems (MAS) for modeling and solving problems with multiple interacting agents. However, most LLMs are pretrained independently and not specifically optimized for coordination. For example, existing LLM fine-tuning frameworks rely on individual rewards, which require complex reward designs for each agent to encourage collaboration. To address this challenge, we model LLM collaboration as a cooperative Multi-Agent Reinforcement Learning (MARL) problem. We develop a multi-agent, multi-turn algorithm, Multi-Agent Group Relative Policy Optimization (MAGRPO), to solve it, building on current RL approaches for LLMs as well as MARL techniques. Our experiments on LLM writing and coding collaboration demonstrate that fine-tuning multiple LLMs with MAGRPO enables agents to generate high-quality responses efficiently through effective cooperation. Our approach opens the door to using MARL methods for LLM collaboration and highlights the associated challenges.
- North America > United States > California > San Francisco County > San Francisco (0.14)
- Europe > United Kingdom > England > Cambridgeshire > Cambridge (0.14)
- North America > United States > Massachusetts > Suffolk County > Boston (0.04)
- North America > United States > California > Santa Clara County > Stanford (0.04)
- Information Technology > Artificial Intelligence > Representation & Reasoning > Agents > Agent Societies (1.00)
- Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)
- Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Undirected Networks > Markov Models (0.47)
- Information Technology > Artificial Intelligence > Representation & Reasoning > Optimization (0.68)
- Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Clustering (0.46)
- Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.46)
Why and How Auxiliary Tasks Improve JEPA Representations
Yu, Jiacan, Chen, Siyi, Liu, Mingrui, Horiuchi, Nono, Braverman, Vladimir, Xu, Zicheng, Haramati, Dan, Balestriero, Randall
Joint-Embedding Predictive Architecture (JEPA) is increasingly used for visual representation learning and as a component in model-based RL, but its behavior remains poorly understood. We provide a theoretical characterization of a simple, practical JEPA variant that has an auxiliary regression head trained jointly with latent dynamics. We prove a No Unhealthy Representation Collapse theorem: in deterministic MDPs, if training drives both the latent-transition consistency loss and the auxiliary regression loss to zero, then any pair of non-equivalent observations, i.e., those that do not have the same transition dynamics or auxiliary value, must map to distinct latent representations. Thus, the auxiliary task anchors which distinctions the representation must preserve. Controlled ablations in a counting environment corroborate the theory and show that training the JEPA model jointly with the auxiliary head generates a richer representation than training them separately. Our work indicates a path to improve JEPA encoders: training them with an auxiliary function that, together with the transition dynamics, encodes the right equivalence relations.
- North America > United States (0.14)
- Europe > Belgium > Flanders > West Flanders > Bruges (0.04)
- North America > United States > Pennsylvania > Allegheny County > Pittsburgh (0.05)
- North America > United States > California > Los Angeles County > Long Beach (0.04)
- North America > Canada > Ontario > Toronto (0.04)
Towards Intuitive Human-Robot Interaction through Embodied Gesture-Driven Control with Woven Tactile Skins
Lam, ChunPing, Chen, Xiangjia, Wu, Chenming, Chen, Hao, Sun, Binzhi, Fang, Guoxin, Wang, Charlie C. L., Dai, Chengkai, Yam, Yeung
This paper presents a novel human-robot interaction (HRI) framework that enables intuitive gesture-driven control through a capacitance-based woven tactile skin. Unlike conventional interfaces that rely on panels or handheld devices, the woven tactile skin integrates seamlessly with curved robot surfaces, enabling embodied interaction and narrowing the gap between human intent and robot response. Its woven design combines fabric-like flexibility with structural stability and dense multi-channel sensing through the interlaced conductive threads. Building on this capability, we define a gesture-action mapping of 14 single- and multi-touch gestures that cover representative robot commands, including task-space motion and auxiliary functions. A lightweight convolution-transformer model designed for gesture recognition in real time achieves an accuracy of near-100%, outperforming prior baseline approaches. Experiments on robot arm tasks, including pick-and-place and pouring, demonstrate that our system reduces task completion time by up to 57% compared with keyboard panels and teach pendants. Overall, our proposed framework demonstrates a practical pathway toward more natural and efficient embodied HRI.
- Asia > China > Hong Kong (0.05)
- Europe > United Kingdom > England > Greater Manchester > Manchester (0.04)